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ABSTRACT 

This thesis describes the functions of the Database 
Adtrinistrator (DBA) and hov» they are supported by the 
benchmarked relational database rracnine. An examination of 
the relational ouery language provided^ DBA support services 
required, the performance issues involved, and tne security 
features employed is presented. The goal of this work is to 
develop general guidelines for DBA to follow in implementing 
and operating an effective, responsive database system. 
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I 



AN INTRCDUCTICN 



Although the application of software database tranagement 
systems to user regulreirerts is not new there are emerging 
special-purpose hardware systems which will relieve tne nest 
central processing unit (CPU) from tne time consuming 
processes of accessing, updating, and modifying data. 
Numerous, ccmmerclally-avallatle software database 
management systems for the host computers are currently 
employed in application areas but there appears to be 
associated performance degradation in the host machines. 
These performance issues must be identified and performance 
measured in order to provided some quantitative comparisons 
between software system.s, general-purpose hardware systems, 
and special-purpose hardware systems. Historically this 
information has been collected for general-puroose computers 
by the use of the instruction mix (Gibson or Flynn) to 
measure performance in various categories. This measurement 
of a machine using an instruction as a tool mix is calleo 
benchmariclng , 

The task of benchmarking a database system has not been 
developed in the literature. Consequently, a research 
project has been undertaken by the Naval Postgraduate School 
to develop a set of benchmarking standards which can be 
employed to obtain a performance index of a particular 
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database macblne/systeni and furtner used in a corrparative 
analysis with respect to other datacase sy steirs/machines . 

The initial steps in the berchrrark development have been 
limited to a specific relational database macnine. In 
addition to the measurements of specific database 
operations, a question of the role and responsibilities of 
the database administrator (DBA) is posed, with each system 
benchmanced , there is a need to establish the amount of 
support provided to DBA, In this case an examination of the 
facilities provided, query language employed, and amount of 
additional DBA support required is conducted. 

The objective of this thesis is to categorize the duties 
and responsibilities of DBA and describe how they are 
supported by the benchmarked system. At the beginning, the 
system environment is described, followed by a discussion of 
the query language. An analysis cf DgA functions is then 
made and finally, the fully relational miodel is examined and 
a comparison of this particular query language with another 
well-known language is made. 

This thesis is one in a series of four describing the 
current status of the benchmark development. The other 
three topics are on generating the synthetic database (Ref, 
1] , selection and projection [Ref, 2), and join operations 
CBef, 3), 
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II 



THE BENCHIV ASKING ENVIRONMENT 



A. THE HOST SYSTEM 

The host trachine for the benchmark is Unlvac 1100/42 
located at Pacific Missile Test Center, Point «uqu, 
California, In addition tc cr-site eouipirent, a reincte 
teriPinal is installed at the Naval Postgraduate School, 

1 , The Hardv^are interface 

The hardware interface between the host and the 
database machine is through a Univac 1100/42 I/D channel. 
This interface channel has a 200-thousand byte/seccnd 
capacity and the transirlssicr unit is either a byte or a 
word , 

2 , The Software Interface 

The host software is written by Arrperif Corporation 
of Chatsworth, California, This software consists of the 
host-driver routines whose primary purpose is to parse the 
queries and to translate them into the database trachine 
language. Finally, the host handles the comirunicatlcns 
protocol between the database machine and itself, 

B, THE BACKEND DATABASE MACHINE 

1 , A Modular Design 

The database machine which Interfaces with the host 
is an IDM 500 manufactured by Britton-Lee Incorporated of 
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Los Gatos, California, IDM 500 Is being irariceted by Amcerlf 
Corporation as an RDM IlOO, It is a modular, expandable, 
and microprocessor-based system organized around a central 
high-speed bus, Ihe separate modules are functionally 
oriented. The RDM 1100 employs the relational database 
model which will be discussed in detail in Section V, 

The database processor (28000-based microprocessor) 
supervises and manages all system resources. This processor 
executes most of the software in the system. 

The database accelerator is an ortlonal, high-speed 
processor with an instruction set specifically designed to 
perform certain relational database functions. The 
accelerator has a three-stage pipeline which executes 
instructions at up to 10 MIPS, This processor can initiate 
disk activity and can process data at disk transfer rates. 
The accelerator and the RDM IICO software are so configured 
that the most frequently occurring database work is 
performed by the accelerator under the direction of the 
database processor. 

The cache memory (l,e,, main memory) of RDM lioO is 
composed of 64K-bit chips of dynamic RAM, it may be 
expanded to a maximum of six megabytes. This cache is 
utilized tor RDM 1100 system code, disk caching, indices, 
and user commands. 

Disk Controller modules may be expanded from one to 
four. Each controller can manage from one to four disk 
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drives. The disk controller rrcves data oetween the oisxs 



and the cache nreirory, and is designed to work with the 
accelerator. An optional tape control module supports up to 
eight tape drives which can be used for direct, disk-to-tape 
backup, data, and software loading, 

RDM 1100 and the host(s) communicate with each other 
via RDM llOO's host-interface module, Ihls module accepts 
commands from one or more hosts, and acts on those commands 
accordingly. Each host-interface module can handle up to 
eight hosts and a maximum of eight host-interface modules 
can be made available on RDM IlOO, Hence, a m.aximurt. of 64 
hosts can be accommodated by RDM IlOO, In addition to 
communications handshakino protocols, the Interface module 
performs necessary error checks and causes the host to 
retransmit any information block in which an error is 
detected, [Ref, 43 

2 , The System Configuration 

In the configuration described above Cl,e,, the 
connection of the host and the database machine with an I/O 
channel), the database machine is called a backend database 
machine. The term, 'backend', is used in this context to 
refer to a special-purpose machine operating as a peripheral 
device on one or more host systems. As previously 
mentioned, the use of the backend machine can significantly 
reduce the required CPU time for data manipulation by the 
host. Further advantages are realized through treeing disk 
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space on the host ahd the reduction of I/O cycles; thus 
releasing the CPU to perforrr ether functions necessary for 
the proper operation of the system and execution of 
applications programs. 

The performance of FDf^ 1100 is highly dependent on 
the available hardware configuration. Other performance 
issues such as indexing and data positioning are dependent 
on the software developed. The hardware configurations are 
discussed below and the software issues are discussed in 
Section IV, 

Four test configurations are used during the course 
of this research. The initial configuration is of one-half 
megabyte of cache without the accelerator. This 
configuration will not be marketed cy Amoerlf, tut is tested 
for the purpose of comparison. The next test configuration 
is of one-megabyte of cache with the accelerator. Following 
it, a two-megabyte cache with the accelerator is tested. 
Finally, the accelerator is removed from the configuration. 
The configuration is tested with only the two-megabyte 
cache. The standard commercial configuration is with one- 
megabyte of cache. The accelerator is an optional feature. 
For specific Information on the performance measurements, 
the reader is directed to CRef, 21 and tRef, 33, 
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THE RELATICNAL QUERY LANGUAGE 



A, AN INTRQDUCTICN TC THE LANGUAGE 

In addition to the hardv^are and software to support the 
host/backend Interface# Arrperlf also provides a language for 
reouesting Inforrratlon or operations on data frotr the 
backend database irachlne* This language is called the 

Relational Query Language (ROD. The language# being tne 
only Interface for the user and database adrrinistrator 
(DBA)# is the sole means by which the capabilities and 
limitations of the backend are known to the user and DBA, 
Therefore# a discussion about the facilities of RQL will ce 
presented , 

This section defines two major command groups available 
in RQL. The metanotaticn used in tne command syntax 
consists of the symbols described below, 

( ) used as delimiters in RQL 

C ] used to Indicate anytnlno optional Inside 

the square brackets 

I used to denote a choice of the word eltner 

before or after the bar 

( > used to specify zero or more occurrences of 

anything in the curly brackets 
< > used as metasymbols to denote a construct in 
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ROL with the rarre of the construct between 



the metasytrfcols 

All other worcs In ROL are key words and must appear 
literally [Ref. 53. In the reirainina sections key words are 
capitalized. In sections explaining the commands# an 
abbreviated syntax of each command followed by an 
explanation of the command is provided, Ihls information is 
taken from [Ref, 53 and (Ref, 63, 

B, DATA DEFIMTICN CCMI-ANDS 

In ROL the commands are presented without regard to 
function. However, in most database oooks, (e,g,, [Ref. 73 
and [Ref, 83), there is a distinction between the data 
definition language and the data manipulation language. 
Although this distinction is not made in ROL, it provides a 
logical division of the majority of commands and facilitates 
the understanding of the commands. The data definition 
language consists of those commands wnich are used for the 
description of database objects. 

Data can be represented in seven different types in RDM 
1100, The two-character specif Icatlcns available are for 
the compressed (c) and uncompressed (uc) character string 
with the user providing a maximum length, up to 255 
characters. The difference is that the compressed character 
string is not stored with trailing blanks. Integers can be 
declared with three different byte sizes namely, 1, 2, or 4, 
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The byte size of an attribute liirlts the precision which can 
be accorntPodated in the attribute values. Finally, 

floating-point nurrcers can also be expressed as coinpressed 
(f) or uncompressed (uf). The range provided by these two 
forms is identical and of 31 significant digits. As in 
character strings the user must specify the number of 
significant digits desired. Ihe difference oetween 

compressed and uncompressed floating point is the 
suppression of leading and trailing zeros in the compressed 
floating point. Compression is a feature designed to reduce 
the storage requirement in the database, ine following 
declaration is an example of the use of attribute types: 

name = c25, salary = uf8, age = il, address = c200. 

This example establishes four attributes: 'name' whose 

values can each consist of up to 25 characters, 'salary' 
whose values are floating-point numbers each of which is of 
eight significant digits, 'age' whose values are one-byte 
integers, and 'address' whose values are character strings 
of up to 200 characters each. Notice 'name' and 'address' 
are designated as compressed and therefore trailing blan!<s 
of their values are not stored, 

1 , To Create a Database 

CREATE DATABASE <name> [WITH <optlons>] 
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This corrinand Is used to establish a database which 



will be referred to by the user-specified narpe. The two 
options provided are DISK and CEmanD, DISK allows the 
specification of one or more disks on which the database 
will be stored (e,g,, DISK a 'sys')# DE>iAND specifies the 
number of 2K-byte blocks to be allocated tor the database. 
If the database grows beyond the allocated clocks, it may ce 
extended with the following coirrrand: 

EXTEND DATABASE <naiTe> wITH <options> 

The options are identical to the options of CREATE DATABASE, 
2 , To Create a Relation 

CREATE RELATION <nanre> (<attribute naire> = <format> 
{, <attribute naire> = <format>}) twITH <options>] 

The create command is used to establish the schema 
for a relation. An empty relation is sec up in the database 
when the command is executed with the actual specif Icatlon 
of the attributes in parenthesis being given as depicted in 
the example of data types above. One possible option whlcn 
may be declared is LOGGING, This option causes every cnange 
to the relation to be logged in the database transaction 
log. This feature is extremely important to maintain the 
consistency and integrity of the relation when system 
recovery must be initiated. 



19 



3 



To Create an I n d e x 



CREATE CUMGUE3 [NONCLUSTERED I CLUSTERED] INDEX 

CCN] <oeject naire> (<attribute> {, <attr ibute>} ) 

An index on an attribute of a relation provides a 
direct access to the attribute values in the relation, A 
unique index on an attribute requires ail attribute values 
to be different. There are two prirnary differences between 
clustered and nonclustered indices, A clustered index is 
nondense (i,e,, one entry/blcck) wnereas the nonclustered 
index is dense (i.e., one entry/tuple ) , The second 
difference relates to the storage of data. Aithougn the 
nonclustered index does not affect the placeittent of data, 
the clustered index requires the tuples of the relation to 
be stored in the order of the attribute values. 
Consequently, only one clustered index may be created for a 
relation whereas 250 nonclustered Indices may be defined for 
the same relation. For performance data on ooerational 
enhancement provided oy indices, see CRef, 2] and CRef', 3], 

4 . To Create a View 

CREATE VIEW <view natre> (<target list>) 

[WHERE <cualif lcation>] 

The CREATE VIEW coirrrand establishes a virtual 
relation, l.e,, there is no storage of tuples associated 
with the view, A view is a composite relation (without its 
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own tuples) of attributes frotr other relations or views 



The target list is the list of attributes desired from the 
other relations or views. Finally# the ouallf ication allows 
the user to restrict tne quantity of data in the view to a 
particular category and to provide necessary llnKaces 
between the relations or views. 

5 • To Define a Stored Coirrrand 

DEFINE <stored corr.irand rarre> 

<cofrniand> <<co(nrand>> 

END DEFINE 

In ROL the DEFINE ccirirard provides a mechanise for 
creating subroutines in the database machine. Stored 
commands may have parameters or be parameterless , The 
<coirmand> can be an APPEND# DELETE, REPLACE# RETRIEVE# etc, 
(to be discussed later). There are two aovantages to stored 
commands. One is that it relieves the operator of retyping 
a frequently employed command and allows the DBA to provide 
a a simplified method for ir.vcklng complex queries. The 
second and perhaps most important advantage is the 
performance enhancement. Since the stored command exists in 
the database with all addresses of cited relations resolved, 
the com miunlcat ions between the host and the backeno machine 
is reduced to passing an EXEC token and the command name. 
Examples of stored commands are provided in Appendix A, 
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6 • To Destroy a Database 



DESTROY DATABASE <naire> 

The DESTROY DATABASE conrpand eliminates the entire 
database by removing all linkages from RDM liOO and freeing 
the storage space, 

7 • To Destroy an Object 

DESTROY <object name> 

This command eliminates existing relations, 
established views or stored ccmirands from the database. The 
space freed by the command is reusable by the database. As 
indicated previously, views and stored commands depend on 
existing relations or views. These underlying objects are 
said to have dependencies. An object which has dependencies 
cannot be destroyed without first destroying the dependent 
object. This does not apply to indices, wnicn are 
automatically destroyed when tne relation is destroyed, 

8 , To Destroy an Index 

DESTBCY [NCNCLUSTERED | CLUSTERED] INDEX [ON] 

<otject name> (<attrlbute name> 

{, <attrlbute name>>) 

If an index is unnecessary or the overhead 

associated with keeping an index is to high, the index may 
be deleted from a database by the DESTROY INDEX command. In 
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addition to the object naire, tne user must also specify tne 
exact attributes of the index for the purpose of avoiding 
any ambiguity, 

C, DATA ?«ANIPULATICN COM/^ANCS 

The data manipulation language is that subset of RQL 
commands which allows the user to access, update, and 
retrieve the data stored in the database, 

1 , To Retrieve Data 

RETRIEVE [UNIQUE] (<target llst>) [ORDER [BY] <order 
specif lcatlon> [;A I D] 

{, <order specif lcatlon> [:A I D]>] 

[WHERE <quallf lcation>] 

The RETRIEVE command is the most commonly employed 
command in ROL, It is the means by which data is extracted 
from the database and returned to the user. The target list 
provides the user with the facility to reduce the amount of 
data by limiting the number of attribute values reouested. 
The format for the target list is: 

relation. name, attribute. name 

[, relation. name, attribute. name] , 

This list of attributes can be from one or more relations. 
To reduce duplicate information, unique can be employed. 
The order specification dictates the order (l,e.. 
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alphabetic, nurrerlc or alphanuirer ic order) In which the data 



is to be sorted. Finally, the cualif icatlon allows tne user 
to specify predicates which the data trust satisfy and to 
require linkages between relaticns. These predicates and 
linkages reduce the number cf tuples retrieved, 

2 , To Append Tuples 

APPEND [TC] <relation nairie> (< value llst>) 

[WHERE (<qualitlcation>)3 

The APPEND command allows the user to add tuples to 
a specific relation. The value list must specify the 
attribute names and attribute values with an equality sign 
in between. Unlisted attribute values in the value list are 
assigned default values (i,e,, blanks for characters and 
zeros for numerals), 

3 , To Replace Attribute Values 

REPLACE <relatlon name> (<value llst>) 

(WHERE <qualification>) 

REPLACE provides the facilities for updating values 
stored in the database. Although it can only change cne 
relation at a time, the number cf attribute values is not 
limited. Further, more than one relation can be accessed to 
calculate what is to be updated. Although a view name may 
be used in place of tne relaticn name in REPLACE and APPEND 
commands, the numerous restrictions on the acceptability of 
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this procedure irakes It alirost impotent and at best 



Infeasible. 

4 . To Delete Tuples 

DELETE <relatlon-na(re> [WHERE <quallflcatlon>] 

This command Is used to remove one or more tuples 
from a relation. If a wHERE clause is not specified then 
all tuples will be deleted, 

5 , lo Aggregate Attribute Values 

There are six scalar aggregates supplied In RGL 
which may be applied to one cr more attribute values. These 
aggregates return a single value, known as tne scalar, to 
the user. The results of '“'IN and MAX are tne smallest ana 
largest attribute values fcura for the attribute, 
respectively, SUM and AVG provide the arithmetic total and 
mean of the respective attribute values, COUNT returns the 
number of occurrences of the specific attrloute value, ANY 
Is used to test for the existence of a specific attribute 
value. This is accomplished oy appiylna ANY to a condition 
(e.g,, ANY = (relation«name, attribute. name = value)). If 
the condition is true for at least one attribute value a '1' 
Is returned, '0' otherwise. Any scalar aggregate can have a 
predicate (qualification) and, since it returns a single 
value , can be used anywhere a scalar value Is permisslole in 
an expression or other predicate, UNIQUE can be used wltn 
COUNT, SUM, and AVG to avoid including duplicate entries In 
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the computed scalar value. Fcr example, COUNT UNIQUE can be 
used on a personnel database to retrieve the number ot 
different states (assuming birthplace is an attribute) 
represented by the employees place of birth without regard 
to the actual number of employees from each state, Ihese 
scalar aggregates are useful In providing statistics about 
the database and in isolating tuples whose attribute values 
are numeric. For example, a guery can be composed to 
provide a list of attribute values such that each value is 
greater than the average of the values, 
tj . Aggregate Functions 

The term 'function' is misleading when used in this 
context since the results of applying an aggregate function 
is a list of scalars. Although this is not the generally 
accepted concept of a function (returning a single value) in 
the literature, it will continue to be used in this thesis. 
Aggregate functions are used in conjunction witn the 'group 
by' (EY) clause. This clause provides a oartltlon of the 
attribute values. The partitioned values can then oe used 
as arguments of an aggregate function, Tnere can be more 
than one aggregate function in a query, and aggregate 
functions may be nested. Additionally, aggregates may 
appear in both the target list and qualification. An 
example of the application of an aggregate function can be 
found in Section V, 
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The aggregate functians provide the coirrutatlonal 
power of RQL, without the functions there would be no easy 
method of dividing attribute values into sets and performing 
tests or computations on these value sets. Further, the use 
of aggregate functions relieves the user from creating 
numerous temporary relations and from manipulating them 
individually for the desired result. For example, in a 
personnel relation with salaries and department numoers as 
attributes, it may be desirable to compute the average 
salary of each department. This is easily accomplished by 
the use of the aggregate function in the target list as 
follows : 



answer = AVG (salary BY dept-no). 

If this capability were not available, some other form of 
partitioning would be required to support the query, Cne 
might provide a separate retrieve for each department 
number, form a temporary relation for the retrieved salary 
figures, and average on the newly formed relation 
separately, 

7 , String-Manipulation Functions 

In order to maintain the simple format of a 
relational system and yet provide the capability to obtain 
data based on partial or combined attributes, RQL Includes 
three string manipulation functions. The most useful of the 
three for the user appears to be pattern matching. By using 
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syirbols to represent any nuirber of characters, tuples having 
a desired Internal pattern in a specific attribute can be 
selected. Again, consider a personnel relation •i«<ith an 
attribute of date of birth. The pattern matching function 
could be used to provide a list of all personnel born in a 
particular month. Pattern iratcning applies only to 
characters and Is only used in predicates. 

The other two functions are CONCAT and SUBSTRING, 
These functions can be used with character or binary 
attributes, CCNCAT reouires two string arguments, strips 
all trailing blanks from both strings and concatenates the 
second string to the first. The SUBSTRING function requires 
a starting position in the string, a length to define the 
number of characters desired, and the attribute on which to 
perform the operation. For an example of SUBSTRING 
employment, see Appendix A, 

8 , System Supplied Functlcrs 

There are three categories of system supplied 
functions available in RQL, These provide information about 
the database and host, cross reference of system asslgnea 
identification numbers to associated character strings, and 
data type conversions. 

The first group of functions is parameter less and 
provide general information about the host and database. For 
example a user may request the name of the database 
CDATABASENAME C)J, the time or date CGETTI.«£ () or GETDATE 
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()], the attached host [HOST ()], the identity of the CBA 
[DBA ()], or who is executing a cotritand [uStRID ()3, 

The second group of functions is useful in providing 
information in a meaningful form to tne user. There are 
three self-explanatory ccrrirands in this group [RfL-ID 
(relation name), RKL^NAME (relation ID), and FIELD.NA/-E 
(relation ID, attribute ID)). These translations are used 
extensively in Appendix A. 

The last group provides the capacllity to convert 
expressions (exp) from one data type to another. For 
example, a user may convert an expression to a l-, 2- or 4- 
byte integer [INTI (exp), INT2 (exp) or IM4 (exo)J, a 
binary number (BIN (exp)J, or a floating-point number [FLCAT 
(length, exp)]. The expression can be any one of the other 
types listed as well as string and binary coded decimal in 
there legal forms (e.g., compressed and uncompressed), 

D. EXPRESSING THE RELATIONAL OPERATIONS IN THE QUERY 
LANGUAGE 

The power of a relational query language is usually 
measured by its ability to perform the operations specified 
in relational algeora or relational calculus. Since the 
equivalence of the two has been demonstrated [Ref, 8], the 
relational algebra will be used for comparative purposes 
without loss of generality. It should oe noted that RQL is 
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probably ■ best characterized as a domain-based relational 
calculus , 

The relational algebra supports four traditional set 
operations (Union, Intersection, Difference, and Cartesian 
Product) and four special relational operations (Selection, 
Projection, Join, and Civlsicn), All eight operations will 
be defined and an example of each in ROL will oe provided. 
In the examples the term, relation-name, will be abbreviated 
to 'rel' and the term, attribute-name, to 'attr'. 

1 . The selection Operation 

The selection operation provides a subset of tuoles 
in a relation which satisfy a given qualification. All 
attribute values of every tuple satisfying the predicate are 
included in the subset, ROL provides an ALL keyword which 
simplifies the selection operation cy avoiding the 
enumeration of every attribute in the target list, 

RETRIEVE UNIQUE (rel.ALL) wriERE <qualif ication> 

2 . The Projection Cceraticn 

Projection is used to reduce the number of attribute 
values in the tuples which make up the selected subset. In 
addition to limiting the numcer of attribute values in a 
tuple, the projection operation also deletes duplicate 
tuples from the sunset. Deleting duplicates can be enforced 
by using the optional keyword UMGUE, Projection in HOL is 
a function of the target list in the retrieve command, A 
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qualification tray be used to reduce the number of tuples as 
in selection. To reiterate, selection reduces tne number of 
tuples whereas projection reduces the number of attribute 
values , 

RETRIEVE UNIQUE (rel.attrl, rel,attr2, attrn) 
WHERE <qualif lcation> 



3 . The Join Cperation 

The join operation tray be performed on any number of 
relations whose attributes are defined over a common domain. 
The result of the join is a new, higher-degree relation. 
Each tuple, in the resultant relation, is formed oy 
concatenating tuples from the source relations whose 
attribute values satisfy the qualification. 

There are different qualif ications and therefore 
different joins. The equl-jcln is formed over an equality 
predicate. The inequality join is formed over an Inequality 
predicate with an operator such as <, >, <=, >= or i=. The 
following is an example of an eoul-join; the other joins can 
be realized by manipulating the target list (natural join) 
or predicate (Inequality join), accordinoly, 

RETRIEVE UNIQUE (rell,ALL, rel2,ALL) 

WHERE rell , jclnattr = rel2 , jolnattr 
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4 



The Division Cecratlon 



The division operation is defined for two relations 
in which the divisor relation has a degree less than the 
degree of the dividend relation. The resultant relation has 
a degree equal to the difference of the degrees of the two 
relations. The division operation is demonstrated using 

relations rell and rel2 and dividing rell py re 12 where the 
degrees of rell and rel2 are rr and n respectively with m > = 
n. The resultant relation consists of the first (m-n) 
attribute values for each tuple in the dividend , rell, such 
that every tuple in the divisor, rel2, exists as the last n 
attribute values of the uniquely determined partial tuples 
Cldentlcal first (m-n) attribute values! in rell. For 

example if a relation X has tuples abed, abef, beed, and 
abab, and relation Y has tuples cd and ef then X divided cy 
Y would be the relation containing the tuple ab. The tuple 
ab would exist in the resultant relation since abed and abef 
are in X, However, the tuple be would not appear since beef 
is not in X, 

RETRIEVE (rell,attrl, rell,attr2, ,,, , rell ,attr (rr-n) ) 
WHERE COUNT (rel2,attri) = 

COUNT (rell,attn by rell,attrl, 

rell,attr2f ,,,, rell,attr(m-n) 
WHERE rell ,attr(m-n+l) = rel2,attrl 

AND rell ,attr (m-n+2) = rel2,attr2 
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AND 

ANC rell.attriti = rel2.attrn) 

5. The Union Operation 

Union is the traditional set-theoretic definition of 
union with the additional constraint of reouiring the two 
relations to te union-ccrrpatible. Union-corrpatibility 
stipulates that the two relations trust be of the saire degree 
and the correspondina attribute values must be taKen from 
the same domain (e.g,, rell.attric and rel2,attr< must be 
defined over the same domain). The union of two unlcn- 
compatlble relations is the set of all tuples belonging to 
either relation or both relations. Note that duplicates are 
not automatically eliminated cut a RETRIEVE UMGUE 
(union-rel.ALL) can be executed after the following example 
to display the union. 

RETRIEVE INTO union. rel (reil.ali) 

WHERE <quallf ication> 

APPEND TO union. rel (attrl = rel2.attrl# 
attr2 s rel2.attr2, ...» attrlast = rel2.attrlast) 
WHERE <quallfication> 

6 . The Intersection Operation 

Intersection is only defined for union-compatible 
relations. The resultant relation is comprlseo of tuples 
which exist identically in both of the relations. 
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RETRIEVE UNIQUE (reil.all) 



WHERE 


rell.attrl = 


rel2.attrl 


AND 


reli.attr2 = 


rei2.attr2 


AND 


• • • 




AND 


rell .attrlast 


= rel2, attrlast 



7 . The Cartesian-Product Cperatlon 

Given two relations , rel 1 and rel2, of degree in and n 
respectively, the Cartesian product is the set of all tuples 
of degree (m+n) formed by taking the first tuple in rell and 
concatenating to it all tuples (one at a time) in rel2. This 
process is then repeated for the second tuple in rell until 
all tuples in rell have been concatenated with every tuple 
in rel2, 

RETRIEVE UNIQUE (reil.all, rel2.all) 

8 • The Difference Operation 

The difference of two union-compatible relations is 
the set of tuples in the first relation but not in the 
second . 

RETRIEVE UNIQUE (rell. ALL) 

WHERE 0 = ANY (rell.attrl BY rell.attrl 
WHERE rell.attrl = rel2.attrl 
AND 

AND rell .attrlast = rel2.attrlast) 
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This query requires each tuple in rell to be 
compared ’<»itn every tuple in rel2# In the above example, it 
is assumed that rell.attrl is the key for the relation. In 
the event a relation has a composite key, the rell.attrl 
following the BY can be replaced by a linear list of 
attributes comprising the key. 
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IV . THE DATAEASE AC M 1 M ST R ATOR 



The role of DBA is to establish the database and to 
ensure that the database systeir is responsive to the user's 
perforirance requirements and information needs. Although 
the discussion of DBA will use RDH 1100 as the target 
system, the facilities described and DBA support required 
should be applicable to any relational database management 
system. In particular, the amount of DBA support recuired 
does not depend on a particular system. If tne system does 
not provide certain facilities, DBA win be required to 
reformat and/or extract the information from tne database to 
satisfy the users Information needs. Finally, DBA will be 
referred to as an individual; however, the functions can be 
the responsibility of a group of people. 

This section win discuss the functions ana 
duallf Icatlons of DBA in the areas of database environment, 
database design, system services, user services, security, 
and performance enhancement. For each area, a generalized 
statement concerning DBA functions and auaiif Icatlons will 
be provided; then a specific description of the function in 
the RDM 1100 environment will follow, RDM ilOO feature 
which supports it. 
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A, THE DATABASE SYSTEM ENVIRCNMENT 

A software database rnanageirent system is designed to 
support a single database on a general-purpose computer. 
The advantage of a backend relational database machine is 
the support which can be provlaed to multiple hosts and 
multiple databases. The existence of multiple databases on 
a single machine creates two levels of management. Level 
one, the system DBA, is primarily concerneo with machine- 
wide performance and establishing authorizations for the 
database DBAs, Level two, the database DBAs, is concerned 
with the operational data in the individual databases. DBA 
and system DBA should be knowledgeable in the areas outlined 
above to ensure efficient and reliable database performance. 

In RDM ilOO the system DBA has control of a database 
called the system database. Certain commands such as 
creating and destroying databases can be issued only from 
the system database, when a new database is created the 
individual issuing the CREATE DATABASE command will be DBA 
for that database. In this thesis DBA win refer to the 
level-two DBA unless otherwise indicated, 

B, THE DATABASE DESIGN - THE PHYSICAL AND CONCEPTUAL 

SCHEMAS 

AS alluded to above, DBA has numerous areas of concern. 
The second area to be addressed is the database design. 
This topic describes the design of the physical schema and 
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the conceptual scherra, A schetra is slfrply a plan for a 
particular level of the database. The third level, external 
schema, xlll be addressed In Section IV C, 

The physical schema, also called the Internal schema, is 
a plan for the actual storage of data on the physical 
devices available to the database. In RDM 1100 each disk is 
divided into zones of 180 2K-tyte blocks. The first block 
in each zone is reserved for a directory to the contents of 
that block. The number of blocks reaulred for a relation is 
dependent on the number of tuples and the length of the 
tuples. Since the physical schema is a function of the 
database system, the major issue from the DEA perspective is 
whether the system allows the location of data and indices 
to be explicitly specified. 

The conceptual schema is the logical plan normally 
associated with the entire 'organizational view' and 
instituted by DEA, As the physical schema is comprised of 
the actual location and storage structure of the entire 
database, the conceptual schema includes the names of all 
relations, indices, and data dictionary entries in the 
database . 

The primary ouery language subset used to define the 
conceptual and physical schemas is the data definition 
language. The mapping between the physical and conceptual 
schema is performed by the database system, Tnis mapping is 
built as the objects are mace known to the database system. 
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In ROL the CREATE <object> coirirands are the primary 
coitunands employed to specify the conceptual and physical 
schemas. The database system will construct a data 
dictionary for each object. This includes maKlnq the object 
icnown to the system, reservinc appropriate storage, and 
describing the appearance of the object Ce,g,, number, size, 
and type of attributes), in order to design the physical 
and conceptual schemas, DBA must <now the organizational 
structure and must understand database normalization, the 
database system architecture, and the concepts of data 
sharing and ownership, 

1 , Organizational structure 

Since DBA Is responsible for ensuring that the 
database reflects the 'real world' of the organization it 
supports, there is ample justification for a good working 
knowledge of the organization , The objective is 'to develop 
a plan which will accurately reflect the organizational 
requirements without a need to continuously redesign the 
database. Although it is tempting to limit the application 
to one functional area like personnel, DBA must be aware of 
the relationships between the personnel and other entities 
in the organization, without a total oroanizational picture 
DBA will ultimately be faced with redesign to meet the 
organization 's needs. 
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2 1 Normalization 



In order to enhance database reliability and reduce 
redundancy, a solid foundation In relational database desion 
principles Is required. Cne extremely important aspect is 
the DBA's understanding of norma lizatlon. Cnee a specific 
normal form is established for a database, DBA must realize 
the possible implications of deviations from a this normal 
form and should document the exceptions. Normal forms are 
not specifically discussed in this tnesls and the reader is 
directed to tRef, 7] for more infermation. 

The RDM 1100 system, like most existing relational 
systems, requires only that all relations be in first normal 
form. This noriral form stipulates that each attribute value 
in a relation must be atomic. That is, cne value is not 
decomposable. Further, there is only a single-value 
selected from the specific domain for an attribute. Higher 
normal forms must be enforced by CHA. 

3 . Database System Architecture 

DBA must also understand the architecture of the 
database system to exploit efficiencies or avola 
deficiencies. Since database users do not have static 
applications and the data stored is also dynamic, DBA must 
know how to monitor and enhance performance, if possicle, 
when user requirements can no longer be satisfied, 
requirements , 
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4 • Data Sharing and QwrersMc 

One of the prlnfiary reasons for employing a database 
itianagetrent system is to snare the data among users. This 
provides a reduction In the storage of redundant data and 
alleviates the possibility of anomalies associated with 
redundancy. The concept of sharing data must oe tempereo 
with the requirements of user needs and information 
security. Therefore, a means must ce available to provide 
control over the data and to permit the controlling 
authority to decide who will have access to the data he 
controls , 

In RDM 1100 the control of data is a function of 
ownership and access rights. The creator of an object Is the 
owner of that object. Objects wnich may be owned are 
databases, relations, views and stored commands. The owner 
of the object must explicitly permit other users (less DBA) 
to access the object or portions of an object (e,g,, 
specific relation attribute values). For a more detailed 
discussion of ownership and access rlgnts, see Section IV C, 
5 , Recommendations 

In database design the first step is to develop a 
strategy to meet the organizational information 
requirements. Since the conceptual schema is the 
comprehensive data description of the organizational 

information structure, the second step would entail the 
designing of the conceptual schema. By using this approach. 
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data independence can be nraintalned which will prevent the 
iTOdification of applications prograirs due to changes in the 
physical database. Further, the dependencies between the 
conceptual scherra and user requireirents can oe documented to 
ensure changes at the conceptual level will not result in 
changes to the applications prcgratrs. 

It should be noted that DBA must control the 
creation of relations and indices in such a manner so that a 
specific normal form can be maintained. In addition to 
conforming to the itrposed-ncrmal form constraints, each user 
creating relations to satisfy his own needs must not violate 
the relations of the database supporting other users. Such 
violation will certainly contain excessive and redundant 
information, and undermine the initial database design. 
Additionally, if individuals sharing a database are 
permitted to create objects at will, the sharing of oata oy 
all users may be subverted and the database could rapidly 
deteriorate to a user-determined tiling system, 

C. THE DATABASE DESIGN - THE EXTERNAL SCHEMA 

As important as the physical and conceptual schemas are 
to the implementation of a single database, the 
establishment of the external schema is critical to the 
users. In considering divergent user application 
requirements, the external schema provides the means to 
define precisely what will satisfy each users information 
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needs 



The external scherra of the database is different for 



each user or group of users. These scnerras are composed of 
subsets of the conceptual schema. The definition of the 
contents of a particular external schema is normally 
accomplished through access control of objects existing at 
the conceptual level. By restricting the relations, 
attributes, stored commands, and/or views available to a 
user, a subset of the entire database is cefined, 

A user's access to the database is determined by the 
user's access rights. The access rights of a user are 
authorized by DBA and consequently DBA controls user access 
to the database. In addition to the verification and 
matching of host ID and host-user ID to the database 
system-user ID, these access rights are the only means for 
access control in the majority of oatabase systems. In RGL 
the PERMIT and DENY commands on physical objects, virtual 
objects, and stored commands can be used to establish the 
various external scnemas of the database, 

1 , Permit/Ceny Access 

There are two access rights which must be available 
in a database system to provide a user with the appropriate 
level of information. These access rights are read and 
write. Execute privileges can be considered a special case 
of indirect read/write just as create can be a special case 
of write. 
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The two comtrands in ROL which assign the access 



rights are FEP)*'IT and DENT, The PERMIT command grants a 
user a specified access right over an object or command and 
the DENY command revokes or removes such access rights, 
DENY Is primarily employed to revoke a previously granteo 
PERMIT, 

The access rights available In RGL are READ, write, 
EXECUTE, and CREATE, PERMIT READ provides access tc the 
specified objects (relation, view or named attributes of a 
relation or view). To modify or add data to existing 
relations In the database a PERMIT wRiTE for the user or 
group of users on the objects cr portions of objects must be 
explicitly authorized. The keyword all can be used to grant 
read, write, and execute privileges to a user or group of 
users. Only the owner of the object is authorized to DENY 
access to the object. 

There are two cases of implicit access In RDM HOC, 
DBA Is authorized access tc all objects In the database to 
which he has not been explicitly denied access by the owner. 
Even If access Is denied to DBA by the owner, DBA may still 
destroy the object by deleting all references to it in the 
database relations (non-user). Additionally, the owner of 
any object is permitted access to that Object, All other 
accesses must be authorized by the owner of the object. 
This Is the essence of the access control system. 
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2 . Create Physical Objects 



The database inanagetrent system must provide 
facilities to create physical objects in tne database. 
Initially, the database trust be created with an assigned 
DBA, Following this, the physical objects in tne database 
must be created. Although an index is not a physical object 
which may be manipulated by the user, it is discussed in 
this topic. 

In RQL the right to execute the CREATE DATABASE 

command must be explicitly granted by the system DBA, Once 

# 

a user has authorization to create a database, the execution 
of the CREATE DATABASE command maxes him DBA of the named 
database. To add new users to the database, C3A employs the 
database administrator utilities (DBAUj program. The DEAU 
NEVk. USERS command assigns the host-user ID and host ID and 
places them in the HCST.USERS relation. The DESTRCT 
DATABASE command can only be executed by DBA, (l,e,, tne 
owner of the database), 

CREATE <objects> is also controlled through the 
PERMIT and DENY commands. The permission is similar to 
CREATE DATABASE in that only DBA for a particular database 
may authorize the creation of objects. Relations and 
Indices are the physical objects which a user m.ay be 
authorized to create. Only the owner of a relation is 
allowed to create an index on the relation using the CREATE 
INDEX command. However, the DBA must authorize the owner 
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use of the CREATE INDEX ccirirand 



Each of the above 



discussed cotrirands has a counterpart for revocation. Only 
the owner or C8A tray destroy relations and Indices, 

3 , Create virtual Objects 

Once the physical objects are created. It Is 
necessary to create the virtual objects in the conceptual 
schema wnlch will define the external schema for each user. 
Views are the virtual objects which a user may be authorized 
to create, 

CREATE VIEW requires the user to have access to the 
relations over which the view is defined. Only the owner of 
the view may destroy the view with the DESTROY <ooject> 
command , 

4 , Access via Stored Commands 

Finally, an Indirect read/write (execute) access is 
necessary to allow users to extract Information from the 
database through the use of stored commands. Stored command 
Is an RQL term for a user defined function or procedure. 
Although this feature may not be available In every database 
system, It Is very useful and powerful when provided. In 
addition to the efficiency Issue of stored commands 
discussed previously, It Is much easier for the user to 
execute stored commands than to Input long queries. In POL 
PERMIT EXECUTE allows a specified user or group of users to 
execute stored commands. 
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5 • Reconpn’endatlors 



There are three irethods for providing an external 
schema to a user or group of users. The first method is 
through restriction of access or the physical objects in the 
database. The second method is to define virtual relations 
which consist only of the necessary subset of data the user 
is required to access. Finally, the third metnod entails 
the extraction of information from the database through the 
exclusive use of stored commands. 

In RQL the major problems with the first two methods 
are the addition and deletion of data and implementation of 
ALL, As mentioned in Section HI there are too many 
restrictions on the use of views for updating database. 
Additional problems can arise using the first method as a 
result of the system assigning default values to attributes 

V 

wnlch are not explicitly listed in an APPtND command. For 
example, an Insertion of a tuple with a blank key field 
(employee number) for a new employee's salary and name would 
result in a tuple containing the employee's name and salary 
with a blank key field. A separate insertion containing the 
employee number and name would result in two tuples in the 
same relation for a single employee. 

The stored command can be executed without granting 
the user access rights to the relationCs) which are accessed 
by the command. However, exclusive use of stored commands 
for information retrieval is not reasonable since 
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anticipation of every query which a user could possibly 
require is not possible. 

There is a trade-off between access control, 
performance, and relational perspective. Each of these 
issues requires the sacrifice of one of tne trade-cff 
features. In order to resolve this problem, a combination 
of the prescribed methods is required. The use of stored 
commands to input and delete data in a well structured 
database removes the restrictions on tne use of views. 
Further, stored commands can force the entry of all 
mandatory attribute values for a tuple through parameter- 
argument matching which eliminates the duplicate tuple 
problem described above. Combining stored commands for 
updating with the use of views to define tne external 
schemas would provide the most logical approach, in order 
to employ this strategy a major system change is reoulred in 
the Implementation of ALL, 

First, it should be obvious that the most logical 
mechanism for producing an external schema is the view. 
However, the major problem is the necessity to provide 
access to the ATTRIBUTE relation to permit the use of ALL 
with the view name. Therefore, ALL should be implemented 
such that only the attributes or relations the users are 
authorized to access are returned. This should not carry an 
implicit access to the ATTRIBUTE relation. Access to the 
ATTRIBUTE relation can be restricted by implicit use ot 
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user-id predicates on all queries on data dictionary 
relations. The performance issue results from the 
implementation of ALL in the host and the resultant 
communications between the host and tne backend to process a 
query containing ALL, This performance degradation can be 
rectified by Implementing ALL in the backend relational 
database machine, 

D, SYSTEM SERVICES 

The third area is the services provided to DBA by the 
system, DBA will use these services to facilitate system 
backup, crash recovery and provide information about the 
database. The system services establish a nucleus ot 
Information and facilities which DBA may be required to 
augment for his own personal preferences and needs, 

I , System Backup 

Two areas of system backup must be provided to DBA 
to ensure proper system furcticning. The first area is the 
necessity of providing a means to record the contents of the 
database when it is in a consistent state, Tnls is employed 
most frequently by the system DBA and is addressed further 
in the next topic on crash recovery. 

The second area is the need to return the database 
to a previous consistent state as a result of aborting a 
transaction, A transaction is a single command or a series 
of commands which must oe left uncommltteo until the final 
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coir.rrand has finished. This situation can result frotr a user 



decision to abort his transaction prior to completion or the 
necessity of rolling a transaction pack as a result of 
deadlock. Deadlock occurs In a multiple user system when 
one user holds a resource (e,g,, relations) another user 
requires and the second user holds a resource tne first 
requires. In this situation, the system is said to be 
deadlocked since neither user can complete his transaction. 
To resolve deadlock only one of the transactions must be 
rolled back. The solution to user aborts is to restore tne 
database to the state it was in prior to the abort. 

In BDk 1100 the function of backing up transactions 
is invisible to DBA, The TBANSACl relation (to be discussed 
later) is used to maintain the before and after attribute 
values affected by the transaction for relations createo 
with the logging option. The BATCH relation is used for the 
other relations, A transaction is by default a single 
command unless the explicit ccnrrrands BEGIN before and END 
TFANSACTICN after a group of commands is specifieo, abcRT 
transaction can then be issued after BEGIN and before END to 
cause rollback, RDM 1100 employs an optimistic concurrency 
control algorithm which does not prevent deadlock from 
occurring. The resolution of deaclock is completely 
invisible to the user and DBA, 
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2 • Crash Reccvery 



Another facility which irust be provided to C6A is 
the ability to recover froir a system malfuhction . This is 
particularly irrportant when the data on the disX has teen 
lost or contairinated . To avoid excessive time delays, 
periodic copies of the entire database must oe made to 
reduce the amount of updating. The frequency of copying the 
database is dictated by the number of changes in a period of 
time and the time demands of the applications programs. The 
normal method of recovery requires the m.ost recent copy of 
the database and the transactions whicn have occurred since 
the copy was made. Once the copy of the database is loaded, 
the transactions are rerun to bring the dataoase up to date. 
Since the chronological list of transactions is the key to 
recovery, it must be copied from, the database on a frequent 
basis even though the copying of the entire database may be 
less freduent due to the time required. Of course, some 
transaction which were in progress or not in a transaction 
list must be reinitiated by the user. 

RDM 1100 provides DUMP CATAeASE and LOAD DATABASE 
commands in the DBAU facility. Additionally, DUMP 
TRANSACTION is provided to make copies of the transaction 
log. The command which allows rerunnlno transactions after 
a LOAD DATABASE command has been executed is ROLLFCRwapd, 
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3 « SvstenT' Information 



Ihe database system eirployed must provide a data 
dictionary and statistical in formation on the database 
configuration and performance, A data dictionary contains 
descriptive information about the database. It must Include 
all the various schemas (physical, conceptual, external) and 
should include cross-reference information such as which 
programs use what data and synonyms. 

In 1100 there are 13 system-supplied database 
relations which contain descriptive Information about the 
associated database. In addition, there are seven system 
relations which provide a global description of the database 
machine , 

The system relations provide a catalog of the 
databases in RDM 1100, a list cf dlSKS known to the system, 
status and types of locks in the system (used for concurrent 
processing), and the configuration of the communications 
Interface to the attached hcst(s). Another system relation 
provides information concerning tne activity currently 
taking place in the database. Two additional relations are 
used to provide performance data. 

Perhaps more Important for CBa are the 13 relations 
associated with each database, tach relation is listed 
below and a brief description of tne type of Information 
contained is provided. The first 11 are used to supply data 
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dictionary Inforiratlon and the last two provide Information 
related to transaction iranagenrent , 



RELATION NAME 


CESCRIFTICN 


RELATION 


A Single tuple is provided for each 
object in the database. This tuple 
Includes, as appropriate, the name of 
the object, owner, relation identi- 
fication number, size, location, 
number of tuples and their length, 
type of object (user, system, trans- 
action log, file, view or stored 
command), and the number of 
attributes . 


ATTRIBUTE 


A tuple is entered for every attribute 
in the database. This tuple Includes 
the attribute identification number, 
data type, maximum length, associateo 
relation ID, and attribute name. 


INDICES 


Each index has a tuple in the rela- 
tion, Ihe attributes include the 
index identification number, relation 
ID, number of attributes in the index, 
location, and attribute ID(s), 
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PROTECT 


Contains irfcrn-ation assoclatea >rltn 
the explicit access authorized on ocjects 
for users in tne database. 


QUERY 


Contains the stored comirands and 
views . 


CROSSREF 


Describes the dependencies arrong 
relaticns, indices and stored 
coirmands in the datacase. The deoend- 
encies are system defined and not user 
specified. 


USERS 


Describes the mappings between user 
identification numbers^ names, and 
user groups. 


HOST. USERS 


Defines the mapping net ween the nost 
ID, host-user ID, and RDM iloo user ID, 


BLCCKALLCC 


Catalogs the sector assignment within 
a zone. Each tuple represents a sector 
and the assigned ocject. 


DISK. USAGE 


Describes database disk allocation. 


DESCRIPTIONS 


Contains user-specified, textual 
descriptions of objects and attributes 
in the database. 
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BATCH 



Contains tetrForary logging inforitaticn 



used by the system for transaction 
management. This relation orovldes 
information on transactions against 
logged and non-logged relations so 
they can be canceled if required. 

The transaction Information is held 
until the transaction is committed, 

TRANSACT Permanent logging information used 

for crash recovery. 

All of the above relations provide the comprenenslve 
picture of the database. Although the information is in the 
relations, much of it is rot in a usable format. For 

example, only the RELATION relation contains the textual 

name of a relation. Other relations use the internally 
assigned relation ID. Further, some of the information is 
encoded. In order to translate this Information into an 
understandable format DBA must develop stored commands 
(preferable to ad hoc queries). The number of stored 
commands will be dependent on the desires of DBA, However, 
a minimum subset should include commands to list the 
relations, attributes, indices, attributes in an index, 
access list associated with an object, description of an 
object, and oependencies . 
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The following stored ccirn-ands are used to yield the 
fninittiuin subset, TABLES is used to provide a list of objects 
by type (CBA-supplied paraireter to lAeLES), For relations, 
relation name, type, and the nunrber of attributes and tuples 
in the relation are provided, FIELDS provides the relation 
name (parameter specified by DBA to FIELDS), attribute name, 
data type of the attribute (bin, char, int, etc,), and the 
length of the attribute for every attribute in the relation, 
ALL-INDICES provides a list of all indices on user relations 
in the database. The Infcriration provided includes the 
relation name, index identification number, number of 
attributes in the index, and a narrative description of the 
type of index. An additional command, INDEX-LIST, is used 
to provide the same Infcrmaticn as ALL-INDICES except a 
relatlon/vlew name is passed as a parameter and only the 
Indices on that relation/ view are returned, ATT-lN-INDXl 
and ATT-IN-INDX2 are used to list the attributes in an index 
by name. These commands require two parameters; the index 
ID and the relation name. The reason for the development of 
two separate commands is the readaplllty of the output, 
PRCTECTION provides the object name, user name, and type of 
access authorized for an object which is passed as a 
parameter. Another command, ACCESS-LIST, is provided to 
describe an object and the associated access list for a 
particular object, wbatis provides a narrative explanation 
of its parameter from the CESCPIFTICN relation, DEPENDS is 
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used to provide a list of the dependencies on an object 



Finally, another useful cotrirand is aHocREATES which provides 
a list of users who have ceen granted create peritission in 
the database, RQL constructs for the stored coirirands 
described above are provided in Appendix A, 

4 . Translator 

Upon iirpleirentation of a relational database, it 
will be necessary to load the data into the system. Since 
the data exists on some storage device (disx, tape, etc,), 
there should be a mechanism fcr presenting the data to the 
system for immediate loading in a relational format. 

In RDM liOO, assuming the data can be collated as a 
seguence of records on a disk cr tape, tne 'translator' can 
then be used to load the database on a reiation-by-relation 
basis. The 'translator' will ask a series of guestions to 
ascertain the incoming data format and establish the 
relation schema. The following guestions must be answered 
for a relation. The answers are parenthesized, 

1, Output directly to the PC^^? (y/n) 

2, Input file (name) 

3, Database (name) 

4, Name of table (relation name) 

5, Name of 1st field (name of first attribute) 

6, Enter input type and length (input file format) 

7, Enter output type and length (cl2, il, etc,) 
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8, Starting position (Input file) 

(Questions 5 through 8 are repeated for each 

attribute, ) 

9, Record length (input file) 

E, USER SERVICES 

The fourth area Is DBA support provided to the users of 
the relation database systetr, DBA should provide 

servlces/f acuities to the users of the database depending 
on their applications and experience level, a discussion of 
user services In two general areas will be addressed. These 
areas are providing a help facility and providing stored 
cornirands . 

1 . Help Facility 

As with any Interactive systetr# a help facility is 
required to preclude tiire-ccnsutring # tr lal-and-error 
corrective procedures. For a relational database systetr the 
help facility should include# at a pninltrurr# the syntax and 
explanation of every language ccrpirand and an explanation of 
the stored corrrnands# relations, and views. 

In RCL this can be accoirpl Ished by creating a help 
relation with three attributes (object, line number, ana 
text) and defining a stored ccmirand wnich given an object as 
a parameter will explain Its purpose or use. The storea 
command must contain appropriate preoicates in tne WHERE 
clause to ensure the user can only retrieve Information from 
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the help relation about objects which he is authorized 
access. An exairple of the help relation and stored comniano 
is provided in Appendix A, 

2 , Stored Coffirands Provided by DBA 

DBA irust have an in-depth knowledqe ot the query 
language. It is not reasonable to assume that the average 
user will become proficient in tne use of the query 
language. Both query language complexity and performance 
issues must be considered. The examples in sections III and 
V demonstrate some of the complexities in RGL, DBA will ce 
required to assist the user in the proper formulation of 
some queries. In addition, the users will look for 
assistance when confronted with any perceived problem in the 
database. Since DBA is a database expert, the user will 
naturally request his assistance. 

In addition to applications oriented RQL stored 
commands, which are not discussed, DBA should provide 
commands similar to those described earlier in this section 
for the user. Specifically, DEPENDS, WHATIS, FRCTECl XCN , 
AIT.IN.lNDXl , ATT-1N-INDX2, INDEX. LIST, FIELDS, TABLES, and 
HELP Should be provided. The only difference between the 
DBA commands and the user's stored commands is the inclusion 
of the necessary predicates in the wHere clause to limit the 
response to data which the user has been granted access. 
Other minor modifications may also be desired. For example, 
TABLES could be parameter less and return all relations. 
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views, and stored coirrrands to wnich the user has access, 
PROTECTION could be rrodifled to return only the accesses on 
objects the user owns, 

F, SECURITY 

The fifth area for DBA concern is the security of the 
database. The security of a database systeii' is plagued with 
the same problems associated with computer security in 
general. The normal mechanism for security is access 
control. Since a database system is attached (backend) to a 
host, the security measures provided by the host are the 
first level of security afforded the dataoase system. The 
user ID-password logon procedure employed by general-purocse 
computers can be used for database systems to provide the 
same security checks, Additlcnaliy , a host ID check in 
conjunction with the previously mentioned validation can be 
performed when a backend system is used. Security is also 
afforded by the backend machine configuration since the 
database machine is separate from the host and uses its own 
disks for data storage. 

The first security check performed cy RDM ilOO is the 
verification of the host and host-user ID. These IDs are 
verified each time a request is made from tne host to the 
backend machine. Since the security of the database is 
closely associated with the security of the host, the use of 
passwords on the host for identification and verification is 
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essential. The user ID-passwcrd logon procedure is not 
employed In FCNt 1100 but Is taken from the nost which means 
there Is not an additional IC-password required tor the 
backend machine. In addition to the verification and 



matching of 


host 


ID 


and host-user 


ID to 


the 


database 


system-user 


ID, 


the 


access control 


rights 


are 


the only 



security mechanisms available in FDM 1100, 

There are two implicit access rights in RDM iioo. The 
owner (creator) of any object and DBA are permitted access 
to that object unless explicitly denied by tne owner. All 
other accesses must be authorized by the owner of the 
object, Thisls the essence of the security system. The 
remainder of this topic win discuss specific security 
weaknesses in the RDM 1100, 

1 , Security Aspects of *ALL* 

A crucial aspect for security is the Implementation 
of ALL, ALL is used in a query as a synonym for every 
attribute of the relation In the target list. As previously 
discussed, there is not a user ID qualif Icatlcn associated 
with ALL, Therefore, the translation of ALL to its 
attribute equivalents Is based on tne object (relation or 
view), ALL does not work with a view or a relation unless 
the user has READ access on the ATTRIBUTE relation. 
However, once this access Is authorized, the user can 
examine the entire conceptual schema which is certainly a 
violation of security. 
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2 • Systefn Messages 



The use of relation. attrlbute(s) or 

vlew.attribute(s) in the target list returns two seoarate 
error messages if read access to the object is rot 
permitted. One error message (permission denied on ...) 
Indicates the attribute name is valid but access is not 
authorized. The other error message (.., not found) can be 
interpreted as the attribute name is non-existent. Although 
extremely tedious, the errcr messages can be used in a 
trail-and-error method to obtain the conceptual schema. 

3 . User Identification Numbers 

Another serious weakness in tne security of ftOL is 
the deletion of a user from a database. The easiest method 
is to delete the user from the HCST«USERS relation which 
will prohibit him from opening the database. However, if a 
new user is added to the database from. DBAU and the system 
assigns him the UID which was previously assigned to a 
deleted user, the new user will inherit all the accesses 
which were established by DEA and owners for that uiD. This 
is not acceptable since there should not be any implicit 

I 

authorizations for a new user. 

4 . Recommendations 

The recommendatlcns for correcting the 

implementation of ALL are discussed in Section IV C5 above. 
Although not as informative, the return of a single error 



message for both access denied and relation. attribute not 



found would provide less inforiration about the conceptual 
schema of the database. Frcir a user's perspective it does 
not appear to be significant whether access Is denied or the 
object is not in the database. The critical issue is to 
avoid divulgence of the conceptual schema to a user not 
authorized this Information, 

The two methods for correcting the user ID problem 
are the explicit deletion of all access rights in the 
database (PRCTECT, USERS, HCST. USERS) for the old user by 
DBA, or providing a command in the DBAU to delete a user 
from a specified database which will explicitly remove all 
the accesses the user has been granted. The second metnod is 
preferable to the first since the system should orovlde this 
service to DBA, 

G, FINE-TUNING PERFORMANCE 

The last area of concern for DBA is the performance 
enhancement of an existing database. Given that a 
relational database system has already been selected and the 
overall performance factors have been established, DBA 
nevertheless does have a few tods at his disposal which can 
enhance performance. There are features in tne ouery 
language implementation which are more efficient than 
others. For example, a join can run faster depending on 
which relation is held in cache. One lanauage uses the last 
relation listed in the query if other factors are equal. 
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Thus, the order of the relations could be itrportant. 
Another exairple is the use of parenthesis to resolve 
aicbiguity in a list of logical predicates. These features 
are highly lire leinentat ion-dependent and >iill not be 
addressed further. The other three features are data 
reorganization, indices, and data placeirent. 

In ROL DBA v»lll be reoulred to develop a performance 
ironltoring strategy vdhlch iray Include tne periodic execution 
of stored coirirands specifically designed to collect 
perforirance data, 

1 , Data Reorganization 

As data is added to and deleted froir the database, 
there is an associated fragirentatlon of relations in 
physical storage. Even though rrany database systems provide 
the capability to reserve extra space tor relations, this 
will result only in a delay of fragmentation. The extent of 
fragmentation must be monitcred and fragmentation eliminated 
when necessary, 

DBA may specify the number of blocks for a database 
and for a relation in RGL, Additionally, FILLFACTORs can be 
specified for clustered indices on relations. This 
FILLFACTOR determines the percentage of each disk block 
which will be used for the data in the relation when a 
clustered index is created. When the fragmentation becomes 
excessive, the clustered index can be destroyed, recreated, 
and a new FILLFACTOR assigned. This procedure will resort 
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the data in the blocks available for the relation. A 
relation will be allowed to crow until it uses all the 
blocks it is authorized or all blocks in the database are 
full. Since blocks are not re-used when data is deleted 
froir a relation, this will result in reaching rraxitnum block 
capacity and fragirentatior. . CfiA can monitor this activity 
by writing a stored command on the BLQCkS relation. The 
ability to eliminate fragmentation for a non-indexeo 
relation will depend on the number of free consecutive 
blocks available in the database. If enough blocks are 



available 


, the data 


can 


be 


retrieved 


into 


a 


temporary 


relation 


defined 


over 


tbe 


empty blocks. 


the 


original 


relation 


destroyed , 


and 


the 


temporary 


relation 


renamed , 



This strategy can also be employed when reclustering does 
not offer a satisfactory solution to fragmentation, 

2 • Indices 

Indices can enhance the performance of a database 
for data retrieval, CRef, 2] and tRef, 3] have documented 
the actual enhancement in PCM 1100, Since indices are 
application-oriented, they are highly desiraole for 
databases where the majority of operations are retrieval of 
data over large relations or relations wnich are fairly 
static can be identified. If numerous update and apoend 
transactions are envisioned, then a degradation in 
performance could result due to the constant updating of the 
indices. Therefore, DBA should be aware of the size of the 
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relations and types of operations performed on the 
relations. For example, 1£ Insertion Is prevalent then 
avoidance of Indices on the relations which require numerous 
APPENDS, If possible, may reduce degradation, 

3 , Data Placement 

Hypothetically, the placement of data on disks can 
enhance performance. For example. If a join between two 
relations Is performed frequently, then placing the 
relations on separate disks will reduce disk head micvement 
as data Is moved Into cache. Although this hypothesis has 
not been verified due to the lack of facilities for placing 
data in RQL, the data placement strategy should ce 
considered when explicit assignment of physical storage Is 
available. This could be even more significant when 
processing data on-the-fly is reailzeo, considering the 
speed discrepancy between reading data and moving disk 
heads , 
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V 



EVALUATION OF THE RELATIONAL SYSTEM 



A, THE FULLY RELATIONAL SYSTEM 

1 . The Fully Relational Cnaracterlstlcs 

The definition of a "fully relational" database 
tranageirent systetr is given ty Chris Date [Ref, 7], Date 
suggests that most existing systems are not fully 
relational. The primary benefit of considering fully 
relational as a standard and goal for itrplementatlcn is in 
the algebraic pc*er of the language and tne consistency of 
system supplied functions. If the system is deficient in 
any characteristics which Cate describes, appropriate action 
may be taken to provide a senrtlance of a fully relational 
system. First, the concept of fully relational is defined; 
then a comparison of RDM 1100 and RUL to the definition is 
addressed . 

In order for a database to ce character ized as fully 
relational it must support tne following; 

a. "relational databases (including the concepts of 
domain and key and the two integrity rules);" 

b. "a language that is as powerful as the relational 



algebra 


(and 


that 


would 


remain so even 


if 


all 


facilities 


for 


loops 


and 


recursion were 


to 


be 


deleted ) . " 


[Ref, 
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A relational database 



exhibits the following 



properties : 

a. Relations are in first nortrai form. 

b. Associations between relations are explicitly 
connected through common attributes, 

c. Every value appearing in a given attribute is taken 
from the domain for that attribute, 

d. Every relation has a unique primary key which 
distinguishes (Identities) individual tuples. 

In addition to the above properties, two integrity 
rules are required. First, a null value is not permissible 
as an attribute value of a primary key. Second, if a 
relation A has an attribute value which is also the primary 
key of another relation 8, tnen at all times the attribute 
values in relation A must exist in b. This rule prevents the 
missing linkages among relations when attribute values are 
added to relation A or removed from relation B, 

2 , Four Areas of Deficiency 

There are four areas in which RDM ilOO does not meet 
the requirements for a fully relational system. First, 
although specification of the schema includes data types for 
each attribute, no notion of an underlying domain is 
Incorporated, Since attributes are defined by general 
length and type comparisons of attributes are limited only 
to similar types (e,g,, character with character), 
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ffeaningless coirparisons are allcwed, n^ithout the concepts of 
sets, enurrerated types, and ranges avallaoie in hloher order 
languages such as PASCAL or ADA, the support of doiviains will 
always oe questionable. 

second, the requireirent for a unique primary Key Is 
not enforced. The uniqueness of an attribute value can be 
enforced by declaring a unique Index on one of the candidate 
Keys, However, this associates an access path with the 
concept of a Key, These are two logically separate Issues 
and as such should be dealt wltn separately, since the 
e.xlstence of a candidate Key does not imply the need tor an 
access path on that attribute. 

Third, nulls are not Implemented in RDM ilOO, 
However, the default values for integers (zero) and 
characters (blanKs) are provided for unspecified attribute 
values, Tuple(s) may be entered into a relation without 
values for the key fields. Even if unique attribute values 
are enforced through Index specification, at least one tuple 
with the default value In the key attribute will be 
accepted , 

Finally, relations are normally connected through 
the repetition of some (or all) of the key attribute values 
In one relation A and in another relation B, However, there 
is no mechanism to ensure relation E does not contain a 
value in the connecting attribute which does not exist in 
relation A, 
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3 . The Relational Ccn-pleteness 

RDM 1100 perfornns all the relational algetra 
operations defined in Section III voith one exception. This 
exception deals with the eliirlnation of duplicate tuples in 
the results after applying certain operators (projection, 
division, natural join, etc,), Fcr example, althougn the 
result relation may appear tc satisfy a natural join, it is 
obvious that duplicates are net a priori eliminated, since 
the elimination is a function of the associated projection. 
Additionally, a projection of an attribute in a relation 
with duplicate entries will return all the values in the 
attribute without regard to duplicates, A join could be 
simulated by forming a Cartesian product of the two 
relations, applying the predicates to the product, 
extracting the concatenated tuples which satisfy the 
predicates, and projecting the attributes from the target 
list. 

B. COMPARISON CF TWO QUERY LANGUAGES 

This topic provides a ccmparlson of RGL and SQL, The 
selection of SQL as the comparative language is based on the 
relative familiarity of a large number of people with the 
language and its widespread use. 

1 , .Equal Power 

The power of the two query languages is practically 
identical. Both languages are relatlonaily complete which 
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implies 



a. Any relation derivable from the database relations 
using an expression in the relational algebra can be 
retrieved using the language# and 

b. Any derivable relation can oe retrieved using a 
single statement in the language, 

2 • Differences in the Syntax Structure 

The major difference bet ’ween SQL and RQL is the 
syntactic structure. Using the database in Figure l from 
Date# an example of the two query languages will be given. 

This example is a query to list the names of all 
suppliers who do not provide part ”P2'', As can seen from 
inspection of Figure 1 the answer would be one supplier# 
ADA/«ts, 

SQL; SELECT SNAME 

FRCM S 

WHERE 'P2' 1= ALL 

(SELECT P-NR 
FRCM SP 



The query as stated in RQL is: 



RQL! RETRIEVE (S,SNAME) WHERE C = ANT (SP,P-NR BY 

S,SNAME 

WHERE SP,P.NR = "P2" 

AND S,S_NR = SP,S.NR) GO 



71 



s 



S.NR 


SNA>«E 


STATUS 


CITY 


SI 
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20 
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S2 
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S3 
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30 
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P2 
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P3 
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P4 
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RED 


14 


LONDON 


P5 
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BLUE 


12 


PARIS 


P6 
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19 


LONDON 
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S.NR 


P-NR 
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SI 


PI 
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SI 


P2 


200 


SI 


P3 


400 


31 


P4 


200 


SI 


P5 


IOC 


SI 


P6 


100 


S2 


PI 
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S2 


P2 


40C 


S3 


P2 


200 


S4 


P2 


200 


S4 


P4 


30C 


S4 


P5 


400 



Figure 1. The Supplier-Parts Database 
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without regard to Irrcleirentation the above queries 
are resolved as follows: 

a. In the SGL example the sets of suppliers and the 
parts they supply Is formed by. the nested select. Then 
the outer select will return a supplier's name, if ana 
only If the set of parts supplied oy tnat supplier dees 
not contain "P2", 

b. In the RGL example the "by" clause establishes the 
same set as the inner select of the SQL query. Then the 
two boolean expressions are evaluated with the "and" 
conjunction , If no tuples satisfy the conaitlons for a 
given supplier, then the value of ANl (tuple) Is 0, If 
ANY is 0, the oualif Icaticn evaluates to true, and the 
suppliers name Is returned, S.S.NR = SP.S-NR insures 
that suppliers In the S relation but not in the SP 
relation are not ignored (i,e,, that a succller who 
supplies no parts will be included as a tuple in the 
answer to the query). 

The syntactic structure of the example demonstrates 
the major differences in the two languages. SQL is highly 
structured, with nested selects. On tne other hand, fiOL 
does not permit nesting of retrieves out allows nesting of 
aggregate functions to perform the same operations. 
Although it would be purely subjective to favor one method 
over tne other, it appears tnat the structured approacn of 
SGL may be easier to learn initially. However, once the 
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aggregate functions of RGL are irastered, 



the lack cf 



redundancy rnay be irore attractive, 

3 , Other Differences 

RDM 1100 does not inrplenrent nulls, but does supply 
default values (zeros for ruireric fields, blanks tor 
character fields). Therefore, tne results of the scalar 
aggregates and aggregate functions (AVG, min, MAX, ana SUM) 
are not always predictable. This iirpiies the user must be 
extremely knowledgeable about the database and use the 
aggregates with caution (e,g,, explicitly exclude zero 
values from aggregation). In SQL Queries can be constructed 
with "no null" as a Qualifier and the tuples with null 
values in the attribute being aggregated will not be 
included in the returned value, 

SOL uses 'ALL', 'HAVING', 'IN', and others to 
provide a more set-theoretic description of database 
manipulation, ROL provides the same capability in the 
aggregate functions but the concept of set manipulation is 
not explicit, ROL provides a 'MCC' function and some string 
manipulation functions which are also available in SQL, The 
string manipulation functions extend the power of HOL 
particularly when working with database relations (i.e., 
non-user defined relations) which have attributes encoded as 
binary values. 

The 'MCC' function is not correctly implemented for 
negative numbers in ROL or SQL, It returns the modulo class 
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of the argurrent as If the argunrent (nere a positive integer 
and 'attaches' the original sign. For exairple, -l mod 8 = 
-Cl mod 0) = -1. To avoid this inconsistency and to 
correctly implement the mathematical definition, the 
following nested application of mod is required for modulo 
8 : 

mod (mod (ARG, 6) + 8, 8), 

The actual function implemented appears to be a remainder 
function which would be consistent since both query 
languages are implemented in the programming language C, C 
has a remainder function but not a mod function. 
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VI. CONCLUSIONS 



There are three major areas In which DBA must be 
Knowledgeable In order to ensure the successful management 
of a database system. These areas are tne user services# 
performance enhancements# and security factors. The 
specific relational database management system or backend 
machine employed will dictate tne amount of DBA support 
required in each area. 

The user services include the stored commands provided 
by DBA# the loading of data irtc the system# the recovery of 
the database as a result of system malfunction, and a help 
facility. Although these are not comprehensive and the 
exact amount of supp.ort will be discretionary on the part of 
DBA# they do form the nucleus for DBA's planning of user 
support. This support is critical to the acceptance and use 
of the relational database system by the user community. 

The basic tools DBA can use to en nance performance are 
the implementation of the language features# indices# and 
data placement. The performance ennancement which can be 
gained from the query language is only achievable if DBA has 
a solid understanding of the language and how it is 
Implemented, Certain features of tne language will be 
executed faster than others and since there are numerous 
ways to form a query to obtain the same information# 
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knowledge about these characteristics can achieve more rapid 
responses froir the database. Therefore, DBA should review 
user commands in applications programs and provide guidance 



to users 


for 


the purpose 


of exploiting the 


more 


rapid 


features , 


Of 


course, the 


specific features 


will 


vary 



between languages. 

Indices provide another performance tool in databases 
where retrieval and joins are the primary operations. Even 
if these operations are not the most prevalent, Indices may 
still ce employed to enhance performance. If tne database 
has a large number of insertion operations, then avoiding 
the placement of indices on the relations which are changing 
frequently will not result in serious degradation 
attributable to the index updating. Additionally, it 
relations in this type of database which are not subject to 
frequent insertions but are used in numerous retrieves and 
joins can be identified; then placement of indices on these 
relations over the appropriate attributes win enhance tne 
overall performance of the database system. 

The ability to explicitly place aata in the database 
should provide a more responsive system. In order to take 
advantage of data placement DBA must know what relations 
exist in the system and which ores are joined on a recurring 
basis . 

The security aspects on a relational database system 
should be a critical issue for DBA, since a single database 



77 



will be used by various users In the organization, there 
will be data which certain groups of users do not reaulre to 
perform their functions and more Importantly, they should 
not be allowed to access. Although there Is more to 
security than access control, this appears to be the only 
mechanism available to DBA to Implement a security system in 
the database. Consequently, access control should be 
employed to restrict the data available to the users ana 
simultaneously, provide a relational database perspective to 
each user. 

In RDM 1100 there is a trade-off between security, 
performance, and relational perspective. There were three 
methods discussed to provide a single external view of the 
database to a user or group of user. Each of these methods 
required the sacrifice of one of the trade-off features and 
in order to resolve this problem, a change in the 
implementation of ALL is necessary. 

The features and issues discussed In this thesis should 
provide DBA with some guidelines and topics to Investigate 
which will make his database system acceptable ana 
responsive to the users. Although the success or failure of 
any system cannot be real istically placed on a single 
individual. It appears DBA will be more responsloie than any 
other person connected with the system if it does not meet 
the users perceived needs. 
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APPENDIX A 



EXAMPLES OF STORED COMMANDS 



ACCESS-LIST 



DESTROY ACCESS-LIST GO 
DEFINE ACCESS-LIST 

RETRIEVE (RELATION. name, RELATION . TYPE , 

FIELDS = RELATION, ATTCNT, RECORDS = RELAT ION , TUPS , 
USER s USERS.NAME) 

WHERE RELATION, NAME = SO 

AND PROTECT. RELID = RELATION . PELID 
AND PROTECT, USER = USERS. ID 

AND MOD (INTI (SUBSTRING (1, 1, PROTECT . ATTMAP )) , 41 = 1 
END define GO 

ASSOCIATE ACCESS-LIST wITH "RETURNS ACCESS LIST FOR AN 

OBJECT" GO 

ALL-INDICES 



DESTROY ISTATUS GO 

CREATE ISTATUS (STATUS = II, DESC = C30) GO 
APPEND TO ISTATUS (STATUS = 0, 

DESC = "NONUNIQ-NONCLUS-NO DEL SILENT") 
APPEND TO ISTATUS (STATUS = 1, 
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CESC = "UNIG-NCNCLUS-NQ OEL SILENT") 

APPEND TO ISTATUS (STATUS = 2, 

CESC = "NCNUNIC-CLUS-NQ DEL SILENT") 

APPEND TO ISTATUS (STATUS = 3, 

CESC = "UNIO-CLUS-NO DEL SILENT") 

APPEND TO ISTATUS (STATUS = 4, 

CESC = "NCNUMO-NONCLUS-DEL SILENT") 

APPEND TO ISTATUS (STATUS = 5, 

DESC = "UNIG-NCNCLLS-DEL SILENT") 

APPEND TO ISTATUS (STATUS = 6, 

CESC = "NONUNIQ-CLUS-DEL SILENT") 

APPEND TO ISTATUS (STATUS = 1, 

CESC = "UMO-CLUS-CEL SILENT") 

PERMIT READ OF ISTATUS TO ALL 
DENY WRITE OF ISTATUS TO ALL GO 
DESTROY ALL-INDICES GO 
DEFINE ALL-INDICES 

RETRIEVE (REL = REL-NAME ( INDICES .RELiD) , IN D ICES , INDI D , 

INDICES. ATTCNT, ISTATUS , CESC ) 
ORDER BY REL-NAME ( INDICES . HELID ) 

WHERE ISTATUS. STATUS = MCC (MOD ( INDICES . STAT , 8) + 8. 8) 
AND RELATION. RELID = IND ICES . REL ID 
AND RELATION. TYPE = "U" 

END define go 

ASSOCIATE ALL-INDICES WITH "LIST ALL INDICES ON USER 

RELATIONS" GO 
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ATTAIN. INCXl 



DESTROY ATT-IN.INDXl GO 
DEFINE ATT.IN.INDXl 
RETRIEVE (INDICES, INDID, 



ATTl = 


FIELD-NAME 


(INDICES, RELID, 


INTI (SUBSTRING 






C4, 1, 


INDICES. KEYS) ) ) , 


ATT2 = 


FIELD-NAME 


(INDICES. RELID, 


INTI (SUBSTRING 






(14, 1, 


INDICES. KEYS) ) ) , 


ATT3 = 


FIELD-NAME 


(INDICES. relid. 


INTI (SUBSTRING 






(24, 1, 


INDICES. KEYS) ) ) , 


ATT4 s 


FIELD-NAME 


(INDICES. RELXD, 


INIl (SUBSTRING 






(34, 1, 


INDICES. KEYS) ) ) , 


ATT5 = 


FIELD-NAME 


(INDICES. kElID, 


INTI (SUBSTRING 






(44, 1, 


INDICES.KEYS) ) ) , 


ATT6 = 


FIELD-NAME 


(INDICES .RELIO, 


INTI (SUBSTRING 






(54, 1, 


INDICES.KEYS) ) ) , 


ATT7 s 


FIELD-NAME 


(INDICES. RELID, 


INTI (SUBSTRING 






(64 , 1 , 


INDICES.KEYS) ) ) , 


ATT8 = 


FIELD-NAME 


(INDICES. RELID, 


INTI (SUBSTRING 






(74, 1, 


INDICES.KEYS) ) ) ) 


WHERE INDICES, INDID = 


SO 




AND INDICES. RELID s 


REL-ID (51) 





END DEFINE GO 

ASSOCIATE ATT.IN.INCXl WITH "LISTS NA.mES OF ATTRIBUTES IN 

INDEX" GO 



81 



ATT.IN,INCX2 



DESTROY ATT-IN.INDX2 GO 
DEFINE ATT.IN.INDX2 
RETRIEVE (INDICES, INDID, 

ATT9 = FIELD-NAME ( INC ICES . RELID 

C84, 1, 

ATTIO = FIELD-NAME ( INDICES . RELID 

(94, 1 

ATTll a FIELD-NAME ( IND ICES . RELID 

(104, 1 

ATT12 a FIELD-NAME ( INC ICES . RELID 

(114, 1 

ATT13 a FIELD-NAME ( INC ICES , RELID 

(124, 1 

ATT14 a FIELD-NAME ( INDICES .RELID 

(134, 1 

ATT15 a FIELD-NAME ( I N D ICES . RELID 

(144, 1 

ATT16 a FIELD-NAME ( INDICES .RELID 

(154, 1 

WHERE INDICES, INDID = $0 

AND INDICES, RELID a PEL-ID (Si) 

END DEFINE GO 

ASSOCIATE ATT-IN-INDX2 WITH "LISTS NAMES 



, INTI (SUBSTRING 
INDICES, KEYS) ) ) , 

, INTI (SUBSTRING 
, INDICES, KEYS) )) , 
, INTI (SUBSTRING 
, INDICES, KEYS) )) , 
, INTI (SUBSTRING 
, INDICES. KEYS) )) , 
, INTI (SUESTRING 
, INDICES. KEYS) )) , 
, INTI (SUBSTRING 
, INDICES. KEYS) )) , 
, INTI (SUBSTRING 
, INDICES. KEYS) )) , 
, INTI (SUBSTRING 
, INDICES, KEYS) )) ) 



OF ATTRIBUTES IN 

INDEX" GO 
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DEPENCS 



DESTROY DEPENDS GO 
DESTROY OTYPE GO 



CREATE 


CIYPE (TYPE = UCl 


, DESC = UC15) GO 






APPEND 


TO 


OTYPE 


(TYPE 




"U", 


DESC = "USER TABLE 


") 


GO 


APPEND 


TO 


OTYPE 


(TYPE 


= 


"S", 


DESC = "SYSTEM TABLE 


") 


GO 


APPEND 


TO 


OTYPE 


(TYPE 


= 


"T" , 


CESC = "TRANSACTION LOG 


") 


GO 


APPEND 


TO 


OTYPE 


(TYPE 


3 


"F" , 


CESC = "FILE 


") 


GO 


APPEND 


TO 


OTYPE 


(TYPE 


3 


"V" , 


CESC = "USER VIEW 




GO 


APPEND 


TO 


OTYPE 


(TYPE 


= 


"C" , 


CESC = "STORED COMMAND 


") 


GO 


DENY WR 


ITE 


: OF OTYPE GO 













DENY BEAD OF CTYFE GO 
DEFINE DEPENDS 

RETRIEVE (OBJECT = RELATION , NAI^E , WHICH. IS. A = 

STRING (15, CTYFE. CESC), DEPENDS. ON = Si) 
WHERE CRCSSREF. RELIC = REL-ID (SI) 

AND CRCSSREF, DRELIC = REL AT IGN . RELID 
AND CIYFE.TYFE = RELATION , TYPE 
END define GO 

associate depends with "LISTS DEPENDENCIES OF THE NAHEO 

OBJECT" GO 



FIELDS 



DESTROY FIELDS GO 
DESTROY FIELD. EGUIV GO 
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CREATE FIELD.EQUIV (NAME = UC4, NUM s II) GO 
APPEND TO FIELD-EGUIV (NAME = "FLT , NUM = 35) GC 

APPEND TO FIELD. EGUIV (NAME = ''BIN % NUM = 45) GC 

APPEND TO FIELD-EGLIV (NAME = "CHAR", NUM = 47) GO 
APPEND TO FIELD-EQUIV (NAME = "InT ", NuM = 48) GC 

APPEND TO FIELD-EOUIV (NAME = "IM ", NUM s 52) GC 

APPEND TO FIELD-EOUIV (NAME = "INT ", NUM = 56) GC 

DEFINE FIELDS 

RETRIEVE (TABLE = RELATION . NAME , FIELD = ATTP IBUTE . N AME , 

TYPE = FIELD-EOUIV. NAME, LEN = AT TR lEUIE , LEN ) 
WHERE ATTRIBUTE. RELID = REL AT ION . RELID 
AND RELATION. NAME = STABLE-NAME 
AND FIELD-EOUIV .NUM = ATT R I B UTE . TYPE 
END DEFINE GC 

ASSOCIATE FIELDS WITH "RETURNS ALL FIELDS IN THE NAMED 

RELATION" GC 



HELP 



HELP-PEL 



OBJECT LINE-NC DESCRIPTION 



ATT-IN-INDXl 



1 THIS IS A STORED COMMAND WHICH HAS 

2 TWC PARAMETERS. THE FIRST PARA- 

3 METER IS THE INDEX ID NO. AND THE 

4 SECOND IS THE RELATION NAME. 
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5 THESE PARAMETERS MUST BE SEPAR- 

6 ATEC BY COMMAS, THIS COMMAND 

7 PRCVIDES TrtE ATTRIBUTE NAMES OF 

8 EACH attribute IN THE INDEX FOR 

9 THE GIVEN RELATICN CR VIEW. TO 

10 EXECUTE THIS COMMAND JUST TYPE 

11 IN "HELP" followed BY THE OBJECT 

12 NAME AND "GO", 



DESTROY HELP GO 
DEFINE HELP 

RETRIEVE (HELP.REL.DESCRIPTICN) 

ORDER BY HELP-REL.LINE.NC : A 
WHERE HELP-REL, OBJECT = SCSJECTNAME 

AND FRCTECT. RELID = FEL-ID (HELP-REL .CBJECT ) 

AND PROTECT, USER = USERID () 

AND (MCD (INTI (SUBSTRING (1, 1, PROTECT , ATTM A P D) , 

4) = 1) 



END DEFINE GO 

PERMIT EXECUTE OF HELP TO ALL 

ASSOCIATE HELP WITH "PROVIDES INFORMATION ABOUT THE OBJECT 

PASSED AS A PARAMETER" GO 



INDEX-LIST 



DESTROY INDEX-LIST GO 
DEFINE INDEX-LIST 
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RETRIEVE (RELATION. NAVE, INC ICES . INDIO , INDICES , ATTCM , 

ISTATUS.DE5C) 

ORDER BY INDICES. INDID 
where INDICES. relid = RELATION . RELIC 
AND RELATION. NAME = $0 

AND ISTATUS. STATUS = MOD (MOD ( INC ICES . ST AT , 8) 

+ 8 , 8 ) 

END DEFINE GO 

ASSOCIATE INDEX. LIST WITH "LIST INDICES ON NAMED 

RELATION/VIEW" GO 

PROTECTION 



DESTROY PTYPE GO 
DESTROY ATYPE GO 

CREATE PTYPE (ACCESS = II, DESC = UC15) GO 



APPEND 


TO 


PTYPE 


(ACCESS 


S 


1, DE 


SC = 


"HEAD 


") 


APPEND 


TO 


PTYPE 


(ACCESS 


= 


2, DESC = 


"aRITE 


") 


APPEND 


TO 


PTYPE 


(ACCESS 


= 


3, DESC = 


"ALL 




APPEND 


TO 


PTYPE 


(ACCESS 


5 


-32, 


DESC 


= "EXECUTE 




APPEND 


TO 


PTYPE 


(ACCESS 


= 


-53, 


DESC 


= "CREATE 


DATABASE" ) 


APPEND 


TO 


PTYPE 


(ACCESS 


s 


-56, 


DESC 


= "CREATE 


") 


APPEND 


TO 


PTYPE 


(ACCESS 


r 


-58, 


DESC 


= "CREATE 


INDEX ") 



PERMIT READ OF PTYPE GO 
DENY WRITE OF PTYPE GO 

CREATE ATYPE (ACCESS = II, DESC = UC8) GO 
APPEND TO ATYPE (ACCESS = 1, DESC = "PERMIT ") 
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APPEND TO ATYPE (ACCESS = 2 , CESC = "DENY ") 

APPEND TC ATYPE (ACCESS = 3, CESC = "BOTH ") GG 

PERMIT READ CF ATYPE GC 

DENY WRITE OF ATYPE GO 

DESTROY PROTECTION GO 

DEFINE PROTECTION 

RETRIEVE (ACCESS = CONCAT (ATYPE. CESC, PTYPE.DESC), 

OBJECT = RELATION, NAME, USER = USERS, NAME) 
WHERE ATYPE. ACCESS = mCD (IMl (SUBSTRING (1, 1, 

PROTECT, ATTMAF) ) , 4) 
AND PTYPE, ACCESS = PROTECT , ACCESS 
AND RELATION, RELID = PROTECT .RELID 
AND RELATION. NAME = $0 
AND PRCTECT.USER = USERS.IC 
END DEFINE GO 

ASSOCIATE PROTECTION wITH "DISPLAY PPCTECTICN DATA ABOUT 

THE named RELATION" GO 



TABLES 



DESTROY TABLES GC 
DEFINE TABLES GC 

RETRIEVE (RELATION, NAME, RELATION, TYPE , FIELDS = 

RELATION, ATTCNT, RECORDS = RELATION . TUPS ) 
ORDER BY RELATION, NAME ; A 
»»HERE RELATION, TYPE = SO 
END DEFINE GO 
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ASSOCIATE TABLES V»ITH "RETURNS LIST OF RELATIONS, VIEWS OR 



STORED COMMANDS'* GC 



WHATIS 



DESTROY WHATIS GO 
DEFINE WHATIS 

RETRIEVE (RELATION = REL-NAME (DESCRIPTIONS , RELID ) , 

EXPLANATICN = DESCRIPTIONS . TEXT ) 
»»HERE DESCfilFTIONS. RELIC = REL-ID ($0) 

END DEFINE GO 

ASSOCIATE WHATIS WITH "EXPLAINS WHAT A STORED 

CCMMAND/RELATION DOES/IS" GO 

WHCCPEATES 



DESTROY WHQCREATES GO 
DEFINE WHCCREATES 

RETRIEVE (USERS. NAME, PTYPE.DESC) 

WHERE PROTECT. USER = USERS. ID 

AND (PROTECT. ACCESS s -53 OR PROTECT . ACCESS = -56 OR 

PROTECT. ACCESS = -58) 
AND PROTECT. ACCESS = PTYPE. ACCESS 

AND MOD (INTI (SUBSTRING (1, 1, PROTECT . ATTMAP )) , 

4) = 1 

END DEFINE GO 

ASSOCIATE WHOCREATES WITH "LIST USERS WHO HAVE CREATE 

PERMISSION" GC 
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